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Preface 


The  Problem 

Twinkle,  twinkle  little  star.  How  I  wonder  what  you  are, 

Up  above  the  world  so  high,  Like  a  diamond  in  the  sky...  [38] 


The  Observation 

If  the  Theory  of  making  Telescopes  could  at  length  be  fully  brought  into  Practice, 
yet  there  would  be  certain  Bounds  beyond  which  Telescopes  could  not  perform. 
For  the  Air  through  which  we  look  upon  the  Stars,  is  in  perpetual  Tremor;  as 
may  be  seen  by  the  tremulous  Motion  of  Shadows  cast  from  high  Towers,  and 
by  the  twinkling  of  the  fix’d  Stars.  But  these  Stars  do  not  twinkle  when  viewed 
through  Telescopes  which  have  large  apertures.  For  the  Rays  of  Light  which  pass 
through  divers  parts  of  the  aperture,  tremble  each  of  them  apart,  and  by  means 
of  their  various  and  sometimes  contrary  Tremors,  fall  at  one  and  the  same  time 
upon  different  points  in  the  bottom  of  the  Eye,  and  their  trembling  Motions  are 
too  quick  to  be  perceived  severally.  And  all  these  illuminated  Points  constitute 
one  broad  lucid  Point,  composed  of  those  many  trembling  Points  confusedly  and 
insensibly  mixed  with  one  another  by  very  short  and  swift  Tremors,  and  thereby 
cause  the  Star  to  appear  broader  than  it  is,  and  without  any  trembling  of  the 
whole.  Long  Telescopes  may  cause  Objects  to  appear  brighter  and  larger  than 
short  ones  can  do,  but  they  cannot  be  so  formed  as  to  take  away  that  confusion 
of  the  Rays  which  arises  from  the  Tremors  of  the  Atmosphere.  The  only  Remedy 
is  a  most  serene  and  quiet  Air,  such  as  may  perhaps  be  found  on  the  tops  of  the 
highest  Mountains  above  the  grosser  Clouds  [26]. 


A  Solution 

Sir  Isaac  Newton’s  keen  observation  provides  an  insightful  introduction  to  the 
fundamental  limitations  of  astronomical  imaging  imposed  by  atmospheric  turbu¬ 
lence.  In  recent  times,  automated  systems  have  been  employed  to  sharpen  the 
image  of  a  celestial  object  before  it  is  recorded  [23]  and  computer  software  has 
been  developed  to  mitigate  the  blurring  effects  of  atmospheric  turbulence  [21]. 
The  technique  developed  in  this  thesis  seeks  to  characterize  the  cumulative  effects 
of  the  atmosphere  between  the  celestial  body  and  adaptive  optical  telescope  [36]. 
Specifically,  this  technique  estimates  wave  front  slopes  over  each  subaperture  of 
a  Hartmann-type  wave  front  sensor. 
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Abstract 

Current  methods  for  estimating  the  wave  front  slope  at  the  pupil  of  a  telescope  equiped 
with  a  Hartmann-type  wave  front  sensor  (H-WFS)  are  based  on  a  simple  centroid  calculation 
of  the  intensity  distributions  (spots)  recorded  in  each  subaperture  of  the  H-WFS.  The  centroid 
method  does  not  include  any  knowledge  concerning  correlation  properties  of  the  slopes  over 
the  subapertures  or  the  amount  of  light  collected  by  the  telescope  and  diverted  to  the  H-WFS 
for  wave  front  reconstmction  purposes.  This  thesis  devises  a  maximum  likelihood  (ML) 
estimation  of  the  spot  centroids  by  incorporating  statistical  knowledge  of  the  spot  shifts.  The 
light  level  in  each  subaperture  and  the  relative  spot  size  is  also  employed  by  the  shift  estimator. 
The  shift  estimator  is  found  to  be  unbiased  and  is  upper  bounded  by  the  mean  squared  error 
performance  exhibited  by  the  classical  centroid  technique. 
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Maximum  Likelihood  Estimation  of  Wave  Front  Slopes  using  a 

Hartmann-type  Sensor 


I.  Background 

1.1  Introduction 

Many  years  have  passed  since  Isaac  Newton  recorded  his  observations  of  the  “Tremors  of 
the  Atmosphere”  [26].  These  “Tremors”  are  responsible  for  the  scintillation  effects  commonly 
called  twinkling.  We  now  know  that  the  diurnal  heating  and  cooling  of  the  Earth’s  surface  is 
the  root  cause  of  atmospheric  turbulence  [32] .  This  turbulence  reduces  the  effective  resolution 
of  a  telescope  and  consequently  blurs  imagery  of  celestial  objects  [8], 

Although  atmospheric  turbulence  is  unavoidable,  the  distortion  effects  (scintillation 
and  blur)  can  be  reduced  by  proper  site  selection  [32].  Placing  an  observatory  on  top  of 
a  mountain,  where  the  air  is  “serene  and  quiet,”  can  lessen  the  deleterious  effects  of  the 
atmosphere.  For  example,  the  Air  Force  Maui  Optical  Station  (AMOS)  is  located  atop  the 
10,000-ft  Mount  Haleakala  on  the  island  of  Maui,  Hawaii  [2].  Unfortunately,  this  “Remedy” 
proposed  by  Newton,  falls  short  of  achieving  the  theoretical  resolution  of  large  diameter 
telescopes.  Although  larger  diameter  telescopes  allow  one  to  view  dimmer  objects,  they 
do  not  significantly  improve  the  resolution.  In  recent  years,  adaptive  optical  (AO)  imaging 
systems  have  been  successfully  employed  to  compensate  for  the  distortions  induced  by  the 
atmosphere  [46]. 

AO  imaging  systems  compensate  for  atmospheric  effects  before  the  image  is  formed 
by  “realigning”  the  light  rays.  Without  adaptive  optics,  the  theoretical  resolution  achieved 
by  modem  telescopes  is  on  par  with  those  of  the  amateur  astronomer-only  a  couple  of 
arc  seconds  [32].  In  1953,  Horace  Babcock  suggested  that  the  atmospheric  blurring  of 
astronomical  imagery  could  be  reduced  by  mechanical  means-an  AO  imaging  system  [17].  In 


1 


1956,  Robert  Leighton,  a  self-proclaimed  amateur  astronomer,  recorded  the  clearest  images  of 
Jupiter,  Saturn,  and  Mars  published  at  that  time  [23].  Leighton  attached  a  “simple”  first-order 
AO  system  to  the  60-inch  reflecting  telescope  at  the  Mount  Wilson  Observatory. 

AO  imaging  systems  technology  is  an  important  area  of  current  research  aimed  at 
improving  image  quality  [1].  The  performance  of  an  AO  imaging  system  is  fundamentally 
limited  by  many  factors.  The  most  significant  factors  are  the  finite  amount  of  light  diverted 
to  the  wave  front  sensor  (WFS)  and  the  finite  spatial  sampling  of  the  incident  wave  front 
by  the  WFS  [36].  Cost  also  limits  the  performance  of  AO  imaging  systems.  As  the  cost  of 
these  sophisticated  systems  increases,  many  organizations  and  amateur  astronomers  may  turn 
to  hybrid  approaches.  A  hybrid  approach  may  be  used  to  supplement  deficiencies  in  an  AO 
imaging  system  by  combining  mechanical  pre-compensation  from  an  AO  system  and  image 
post-processing  [27,  34]. 

Image  post-processing  techniques  range  from  the  simple  inverse  filter  [13,  18]  to  so¬ 
phisticated  blind  deconvolution  methods  [19,  22,  41].  A  relatively  recent  hybrid  technique 
known  as  deconvolution  from  wave  front  sensing  (DWFS)  was  proposed  by  Fried  [9]  in  1987 
and  extended  by  Primot  etal.  [31]  in  1990.  DWFS  explicitly  uses  WFS  data  from  an  imaging 
system  to  improve  image  resolution. 

The  Air  Force  mission  of  imaging  exo-atmospheric  objects  provides  the  central  mo¬ 
tivation  for  developing  techniques  to  overcome  the  deletrious  effects  of  the  atmosphere  on 
ground-based  imaging  [6]. 

The  purpose  of  the  following  sections  is  to  illustrate  the  importance  of  estimating  the 
wave  front  slope  using  a  Hartmann-type  WFS  (H-WFS)  for  wavefront  reconstruction  purposes. 
The  first  pertinent  background  topic  discussed  conceptualizes  the  nature  of  atmospheric  tur¬ 
bulence.  A  brief  review  of  the  canonical  AO  imaging  system  and  H-WFS  is  followed  by 
a  discussion  of  two  image  restoration  techniques.  The  problem  statement  and  the  proposed 
solution  conclude  this  chapter. 
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1.2  Atmospheric  turbulence 

It  is  well  known  that  atmospheric  turbulence  degrades  astronomical  imaging.  The 
cause,  effects,  and  a  measure  of  atmospheric  turbulence  are  important  to  understanding  the 
significance  of  the  astronomical  imaging  problem.  The  stochastic  mechanism  causing  the 
turbulence  indicates  that  a  statistical  solution  is  needed.  A  study  of  the  effects  tell  us  that 
neutralizing  the  degradation  induced  by  the  atmosphere  is  possible.  The  measure  allows  us  to 
analytically  characterize  the  overall  turbulence  effects  with  a  single  parameter. 

The  fundamental  mechanism  responsible  for  the  atmospheric  turbulence  phenomenon 
is  the  diurnal  heating  and  cooling  of  the  Earth’s  surface.  Uneven  temperature  distributions 
create  large  scale  inhomogeneities  (eddies)  in  the  refractive  indices  of  the  air  [32].  Eddies 
are  homogeneous  pockets  or  regions  of  air  [15].  Kolmogorov  asserted  that  large-scale  eddies 
transmit  energy  without  loss  to  progressively  smaller  eddies  causing  optical  turbulence  [29]. 
Hence,  the  random  fluctuations  of  the  refractive  indices  of  air  indicate  that  a  stochastic 
description  of  the  turbulence  is  needed.  As  a  planar  wave  front  propagates  through  the 
atmosphere,  the  turbulence  modulates  both  the  phase  and  amplitude  [8]. 

The  most  significant  effect  of  atmospheric  turbulence  is  that  it  imparts  a  random  tilt 
onto  the  wave  front  [7].  Additionally,  the  wave  front  phase  perturbations  are  generally  greater 
than  the  amplitude  distortions.  The  amplitude  distortions  appear  as  scintillation  or  twinkling. 
The  overall  tilt  imparted  onto  the  wave  front  is  evident  by  apparent  object  motion  in  the  image 
plane  of  the  telescope.  This  motion  produces  the  decrease  in  resolution  evident  in  astronomical 
imaging.  One  method  of  modeling  the  phase  effects  on  a  finer  scale  is  to  consider  how  light 
rays  interact  with  the  atmosphere.  When  an  optical  wave  front  passes  through  the  varying 
sized  eddies,  the  individual  light  rays  experience  varying  amounts  of  retardation.  In  other 
words,  the  planar  wave  front  is  dimpled  and  tilted. 

Fried  derived  a  coherence  length  measure,  r0,  which  gauges  the  overall  strength  of 
turbulence-induced  perturbations  [8].  When  the  telescope’s  aperture  is  greater  than  tq,  the 
achievable  resolution  is  equal  to  that  of  a  telescope  of  size  ro  in  the  absence  of  atmospheric 
turbulence  [8].  For  an  aperture  of  dimension  r0,  the  nearly  diffraction  limited  image  appears 
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to  dance  around  on  the  detector  as  the  atmosphere  evolves  [23].  Additionally,  ro  roughly 
corresponds  to  the  spatial  dimensions  of  a  subaperture  within  which  a  phase  error  can  be 
measured  and  subsequently  corrected  by  a  single  tilt  mirror  [29].  The  next  section  discusses 
the  components  and  operation  of  adaptive  optical  imaging  systems  employed  to  remove  the 
tilt  and  dimples  from  the  wave  front. 

1.3  Adaptive  optics 

The  light  emitted  or  reflected  by  a  celestial  object  reaching  the  Earth’s  atmosphere 
is  essentially  a  plane  wave.  As  the  optical  wave  front  passes  through  the  atmosphere,  the 
planar  nature  of  the  wave  front  is  perturbed.  The  basic  premise  of  adaptive  optics  is  to 
mechanically  deform  a  reflective  surface  in  the  optical  train  of  the  telescope  to  compensate  for 
the  atmospheric  effects.  In  other  words,  the  AO  system  attempts  to  remove  the  tilt  and  dimples 
in  the  plane  wave  front.  Specifically,  AO  imaging  systems  mitigate  the  induced  phase  errors 
by  sensing  the  perturbations  with  a  WFS  and  then  adding  the  conjugate  phase  to  the  perturbed 
wave  front  with  a  deformable  mirror  (DM)  [17].  The  canonical  AO  system  is  illustrated  in 
Fig.  (1).  The  controller  maps  the  phase  errors  sensed  by  the  WFS  to  actuator  commands  which 
modify  the  DM’s  figure  [25].  The  controller  employs  a  wave  front  reconstruction  algorithm 
to  derive  the  proper  commands  sent  to  the  actuators  [34], 

In  spite  of  the  amazing  achievements  of  current  AO  imaging  systems  [46],  these  systems 
are  fundamentally  limited  by  many  factors.  Roggemann  and  Welsh  state  that  “the  problem 
of  obtaining  enough  light  for  accurate  wave  front  sensing  has  been  the  most  significant  factor 
limiting  the  application  of  AO  technology  to  ground  based  imaging”  [36].  The  finite  amount 
of  light  gathered  by  the  AO  system  to  drive  the  WFS  is  strongly  linked  to  performance  limiting 
shot  noise  and  anisoplanatism  [36].  Shot  noise  or  photon  noise  is  due  to  the  random  rate  of 
arrival  of  photons  [15, 49].  The  technique  proposed  in  this  thesis  attempts  to  overcome  this 
limitation  by  utilizing  all  of  the  light  collected  by  the  WFS  to  estimate  the  wave  front  slopes 
over  each  subaperture.  Anisoplanatism  results  when  the  angular  separation  of  the  object  and 
guide  star  is  greater  than  the  isoplanatic  angle,  which  is  on  the  order  of  a  few  arc  seconds  for 
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light  from  telescope 


Figure  1.  Simplified  adaptive  optics  imaging  system  with  the  primary  ray  paths  shown 

visible  light  wavelengths  [10].  Other  notable  limitations  include  the  finite  sampling  of  the 
wave  front  by  the  WFS  [48],  the  limited  number  of  degrees-of-freedom  on  the  DM  [33],  and 
delays  between  sensing  and  correcting  the  phase  aberrations  [20].  The  next  section  discusses 
a  post-processing  application  which  utilizes  wave  front  slope  estimation. 

1.4  Deconvolution  from  wave  front  sensing 

The  previous  section  on  AO  imaging  described  how  astronomical  imagery  is  improved 
before  it  is  recorded.  This  section  concerns  wave  front  slope  estimation  in  an  image  post¬ 
processing  application.  Deconvolution  from  wave  front  sensing  (DWFS)  is  an  image  post¬ 
processing  technique  which  uses  the  phase  distortion  information  data  collected  by  the  WFS 
to  estimate  the  point  spread  function  (PSF)  [31,  35,  51].  The  image  is  then  restored  by  de¬ 
convolving  the  PSF  estimate  from  the  detected  image.  The  deconvolution  process  effectively 
increases  the  resolution  in  the  image.  The  PSF  is  a  measure  of  the  blurring  induced  by  the 
optical  system,  which  is  dominated  by  the  atmospheric  turbulence.  Hence,  DWFS  image 
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restoration  performance  is  directly  linked  to  the  quality  of  the  wave  front  slope  estimation. 
The  technique  of  applying  the  DWFS  method  to  images  recorded  with  the  aid  of  an  AO 
system  is  called  compensated  DWFS  (CDWFS)  [47].  In  this  thesis,  a  Hartmann-type  sensor 
is  employed  to  gather  light  from  the  perturbed  wave  front. 

1.5  Wavefront  sensing  and  reconstruction 

A  Hartmann-type  WFS  (H-WFS)  consists  of  an  array  of  lenslets  with  an  array  of 
detectors  in  the  back  focal  plane  of  each  lenslet.  Each  lenslet  forms  a  subaperture.  Each 
subaperture  lenslet  forms  an  intensity  distribution  or  spot  in  the  detector.  The  offset  of  the 
spot  from  the  optical  axis  is  termed  the  shift.  The  shift  is  directly  proportional  to  the  average 
wave  front  tilt  or  slope  within  that  subaperture  [49].  The  data  recorded  by  the  H-WFS  is  a 
composite  set  of  images.  This  composite  image  recorded  in  the  back  focal  plane  of  the  H-WFS 
is  called  the  WFS  image. 

Wave  front  reconstruction  involves  converting  the  spot  location  in  each  subaperture  of 
the  H-WFS  to  estimate  the  phase  of  the  wave  front.  The  finite  number  of  photons  detected  by 
the  H-WFS  limits  the  accuracy  of  a  centroiding  procedure.  Hence  employing  the  underlying 
statistical  properties  of  the  wave  front  slope  correlations  from  subaperture  to  subaperture  and 
the  known  light  levels  should  increase  wave  front  reconstruction  performance. 

In  summary,  wave  front  reconstruction  from  measured  wave  front  gradients  (slopes) 
is  vital  to  real-time  imaging  systems  and  image  post-processing  techniques  used  to  restore 
atmospherically  blurred  imagery. 

1.6  The  problem 

Develop  an  algorithm  to  maximize  the  use  of  the  data  recorded  by  the  H-WFS  to 
optimally  estimate  the  wave  front  slope  over  each  subaperture  of  the  H-WFS. 
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1. 7  Proposed  solution 

1.7.1  Introduction.  The  effects  of  atmospheric  turbulence  can  be  modeled  by  an 
equivalent  representation  of  the  wave  front  at  the  telescope  pupil  [15],  see  Fig.  (2).  The 
phase  aberrations  at  the  pupil  are  evident  in  the  subimages  as  spot  shifts.  Therefore,  the 
data  in  the  short-exposure  image  collected  by  the  H-WFS  is  directly  related  to  the  aberrated 
phase  of  the  wave  front  at  the  telescope’s  pupil  and  can  be  used  to  estimate  the  wave  front 
slope.  The  composite  image  recorded  in  the  back  focal  plane  of  the  H-WFS  includes  all  of 
the  subapertures  and  is  called  the  WFS  image.  In  the  following  subsections,  a  conventional 
method  and  a  maximum  likelihood  (ML)  technique  for  calculating  the  spot  shifts  in  the  WFS 
image  are  discussed  and  a  few  comments  on  the  utility  of  this  ML  approach  are  presented. 

1.7.2  Conventional  method.  A  conventional  method  of  estimating  the  wave  front 
slope  in  the  pupil  of  the  imaging  system  from  the  WFS  data  is  based  on  the  location  of  the 
spot  centroid  in  each  subaperture  image.  The  offset  of  the  spot  centroid  from  the  expected 
center  position,  defined  by  the  optical  axis  of  the  lenslet,  is  due  to  the  random  wave  front  tilt 
over  the  corresponding  subaperture  [24].  A  linear  phase  over  the  subaperture  causes  a  simple 
shift  in  the  location  of  the  intensity  distribution  in  the  image  plane  [14].  As  atmospheric 
turbulence  is  never  constant,  the  lenslet  overlays  a  continuous  sequence  of  images  at  random 
offset  locations  in  the  image  plane.  For  short-exposure  images,  the  wave  front  slope  over  each 
subaperture  is  correlated  with  the  neighboring  subapertures.  This  correlation  of  the  wave  front 
tilts  is  used  in  an  ML  estimation  approach  to  estimate  the  wave  front  slopes  from  descrete 
WFS  image  data. 

1.7.3  Maximum  likelihood  approach.  In  order  to  compute  an  optimal  estimate  for 
the  wave  front  slope,  a  model  of  the  WFS  subaperture  slopes  in  the  focal  plane  of  the  recording 
device  for  each  subaperture  must  be  developed.  The  image  created  in  a  specific  subaperture  is 
called  a  subimage.  Next,  the  a  priori  knowledge  of  the  wave  front  slope  statistics  or  correlation 
properties  is  factored  into  the  ML  technique  employed  to  determine  the  most  likely  value  for 
the  wave  front  slope.  The  following  paragraphs  briefly  sketch  the  details  of  this  approach. 
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Figure  2.  Wave  front  distortion,  wave  front  sensing,  and  imaging  processes 


Using  the  linear  systems  framework,  the  incoherent  imaging  model  equates  the  image 
intensity  to  the  convolution  of  the  object  intensity  and  the  point  spread  function  (PSF)  [14]. 
When  a  point  source  is  imaged  through  the  turbulent  atmosphere,  the  blurred  image  is  a 
measure  of  how  the  light  from  the  source  is  spread.  The  point-source  image  is  the  PSF.  The 
detected  image  is  recorded  as  an  array  of  pixels,  thus  the  detected  images  are  essentially 
sampled  versions  of  the  actual  intensity  distribution  imaged  in  the  back  focal  plane.  Since 
the  detection  of  light  is  a  random  process,  it  is  necessary  to  employ  a  stochastic  model  of  the 
detected  image  taking  into  account  shot  noise.  Poisson  statistics  are  well  suited  to  describing 
shot  noise  effects  [28].  The  image  can  be  described  by  Poisson  statistics  if  we  assume  that 
the  Poisson  parameter  or  rate  function  is  known  [30].  The  rate  function  corresponds  to  the 
expected  intensity  of  the  image.  Additionally,  the  optical  detection  in  one  pixel  is  independent 
from  the  detection  in  another.  With  these  assumptions,  the  probability  density  function  (PDF) 
for  each  subimage  is  simply  a  product  of  PDFs  associated  with  the  array  of  pixels. 

The  joint  PDF  of  the  composite  WFS  image  is  derived  after  developing  the  detected 
image  model.  In  the  absence  of  atmospheric  turbulence,  the  PSF  and  the  rate  function  are 
known.  In  reality,  the  rate  function  is  a  random  process  and  optical  detection  is  doubly 
stochastic  [15,  44].  Hence  the  WFS  image  model  is  doubly  stochastic.  Since  phase  effects 
are  dominant,  the  amplitude  effects  are  ignored  [7].  The  joint  PDF  corresponding  to  the 
WFS  image  is  the  product  of  the  conditional  PDF  calculated  for  each  subimage  and  a  PDF 
describing  the  effects  of  the  wave  front  phase  over  each  subaperture.  Modeling  the  phase 
perturbation  over  each  subaperture  as  a  tilt  in  the  wave  front  results  in  a  shift  of  the  spot  in 
each  subimage.  The  phase  tilt  is  modeled  as  a  zero  mean  Gaussian  random  variable  [15].  The 
shift  in  the  spot  location  is  linearly  related  to  the  tilt  and  thus  it  is  also  a  zero  mean  Gaussian 
random  variable.  Since  the  phase  perturbations  across  the  pupil  are  known  to  be  spatially 
and  temporally  correlated,  so  are  the  shifts  in  each  subimage.  Because  only  a  single  frame  of 
short-exposure  imagery  is  used  in  the  estimation  process,  the  temporal  correlation  effects  are 
ignored  [50].  With  knowledge  of  the  H-WFS  subaperture  geometry,  we  can  deterministically 
model  the  rate  function  and  let  the  randomness  be  expressed  as  the  shift  in  the  spot  centroid 
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of  the  subimage.  Finally,  with  the  joint  PDF  determined,  the  ML  approach  is  used  to  calculate 
the  wave  front  slope  estimate  over  each  subaperture. 

1.7.4  Utility  of  the  maximum  likelihood  approach.  In  comparison  to  conventional 
techniques  which  only  use  the  information  gathered  by  the  H-WFS,  the  ML  technique  derived 
in  this  thesis  incorporates  a  priori  knowledge  to  improve  spot  centroid  estimation  accuracy.  In 
conventional  techniques  the  shift  parameter  measurement  in  one  subaperture  is  independent  of 
the  calculation  in  all  of  the  other  subapertures.  In  the  presence  of  excessive  shot  noise  brought 
on  by  low  light  conditions,  the  true  location  of  the  spot  may  be  inaccurately  predicted  by 
conventional  techniques.  When  imaging  dim  objects,  the  amount  of  light  reaching  the  optical 
detectors  of  the  H-WFS  may  be  on  par  with  the  background  noise.  In  the  ML  model-based 
approach,  the  joint  PDF  describing  the  WFS  image  employs  every  pixel  in  the  WFS  image 
to  optimally  estimate  the  shift  parameter  in  each  subimage  using  ML  estimation.  Since  the 
wave  front  slopes  over  the  subapertures  are  correlated,  the  shift  parameters  are  not  calculated 
independent  of  each  other,  thus  a  certain  amount  of  robustness  is  introduced  into  the  shift 
parameter  calculation. 

1.8  Thesis  overview 

Chapter  II  covers  the  theory  related  to  the  modeling  of  the  data  recorded  by  the  H- 
WFS.  Chapter  III  covers  the  slope  parameter  estimation  using  the  ML  technique.  Chapter  IV 
presents  the  properties  of  the  estimators  derived  in  Chapter  HI.  Chapter  V  gives  a  summary 
on  the  thesis  methodology,  conclusions,  and  recommendations. 
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II.  Theory  and  model  derivation 


2.1  Introduction 

The  purpose  of  this  research  was  to  estimate  the  slope  of  the  distorted  wave  front 
with  measurements  from  a  Hartmann-type  wave  front  sensor  (H-WFS).  The  wave  front  slope 
estimate  can  then  be  used  to  reconstruct  the  phase  distortion  modeled  at  the  telescope  pupil  for 
image  restoration.  As  the  wave  front  passes  through  the  atmosphere,  both  its  amplitude  and 
phase  are  perturbed.  The  most  significant  distortion  is  born  by  the  phase  [7],  implying  that 
scintillation  is  negligible  within  a  subaperture.  Scintillation  from  subaperture  to  subaperture 
is  accounted  for  in  the  derivation.  The  slope  of  the  wave  front  in  each  subaperture  is  directly 
proportional  to  a  shift  in  the  centroid  of  the  detected  irradiance  distribution  [14].  Hence, 
estimating  the  centroid  shift  in  each  subimage  is  equivalent  to  estimating  the  slope  of  the  wave 
front  over  each  subaperture. 

The  fundamentally  random  nature  of  optical  detection  is  modeled  with  Poisson  statistics. 
By  allowing  the  Poisson  parameter  to  be  random,  the  Poisson  process  can  incorporate  the 
stochastic  effects  induced  by  the  atmosphere  [15].  The  Poisson  parameter  is  referred  to 
herein  as  the  rate  function  since  it  essentially  models  the  rate  at  which  photons  arrive  at  the 
image  detector  pixels.  The  rate  function  and  the  expected  image  are  shown  to  be  equivalent 
quantities.  For  a  point-source  object,  the  rate  function  is  also  equivalent  to  the  point  spread 
function  (PSF). 

The  following  sections  review  the  principles  of  maximum  likelihood  estimation  theory 
for  random  parameters  and  proceed  to  develop  the  mathematical  model  describing  the  inco¬ 
herent  image  formation  process,  including  the  assumptions  made  to  keep  the  mathematics 
tractable.  By  treating  the  H-WFS  as  a  collection  of  imaging  systems,  the  single  image  model 
is  expanded  into  the  WFS  image  model.  The  ultimate  goal  of  this  chapter  is  to  derive  the  joint 
probability  density  function,  /D,xs  (d,  xs),  where  d  represents  the  data  recorded  by  the  WFS 
and  X£  represents  the  subaperture  tilt  induced  spot  shifts. 
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2.2  Maximum  likelihood  estimation  theory 

The  most  likely  value  for  a  random  vector  parameter  T  of  a  random  vector  process  Z 
can  be  found  by  employing  the  maximum  likelihood  (ML)  technique.  When  the  probability 
density  function  (PDF)  of  the  random  vector  parameter  T  is  known,  the  ML  approach  is  called 
maximum  a  priori  (MAP)  estimation.  This  review  follows  the  development  in  Scharf  [39]. 

Consider  a  random  vector  Z  G  1ZN,  parameterized  by  a  random  vector  T  G  %N  with 
joint  PDF  /z,t(z,  t).  The  joint  PDF  can  be  written  as  a  product  of  a  conditional  PDF  and  a 
marginal  PDF, 

/z,t(M)  =  /z|T(z|t)/T(t),  (1) 


where  the  conditional  PDF  can  be  used  to  estimate  the  parameter  T  when  the  marginal  PDF 
/T(t)  is  unknown.  For  some  observed  value  z  of  the  random  variable  Z,  a  particular  value 
for  the  parameter  t  is  more  probable  than  some  other  value.  In  other  words,  the  PDF  is  at  a 
maximum  value  for  a  particular  realization  when  the  ML  estimate  of  the  parameter  is  chosen. 
This  can  be  stated  mathematically  as: 


t  =  arg 


[max  t  .J 
t  /z,t(z> t)  , 


(2) 


where  t  is  known  as  the  maximum  likelihood  estimate  of  the  random  variable  T.  In  this  sense, 
the  function 

*(t,z)  =  /z,T(z,t)  (3) 


evaluated  at  the  observed  value  z  is  termed  the  likelihood  function  and  L(  t,  z)  defined  as 


L(t,z)  =  ln{/Z)T(z,t)}  (4) 

is  called  the  log-likelihood  function.  The  log-likelihood  function  is  particularly  useful  for 
maximizing  exponential  distributions.  The  maximum  value  of  a  function  is  determined  by 
differentiating  it  with  respect  to  the  independent  variable  of  interest,  setting  the  result  equal 
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to  zero,  and  then  finding  the  roots  of  the  homogeneous  equation.  The  score  function  is  a 
short-hand  notation  for  the  gradient  of  the  log-likelihood  function 

Q 

5(t,Z)  =  dt  L(t’Z)'  (5) 

Thus  the  maximum  value  of  the  function  is  determined  by  solving 

s(t,z)  =  0,  (6) 

where  the  ML  estimate  is  denoted  t.  The  ML  invariance  property  and  several  other  properties 
used  to  characterize  the  performance  of  an  estimator  are  discussed  in  the  following  subsections. 

2.2. 1  Invariance  property.  Using  the  ML  estimate  t,  the  ML  estimate  of  a  function 
of  the  random  parameter  vector  T  can  be  computed.  If  the  random  vector  G  =  /( T),  then 
by  the  invariance  property  of  the  ML  technique,  the  ML  estimate  g  is  /( t). 

2.2.2  Estimator  properties.  Two  important  properties  used  to  gauge  the  quality  or 
performance  of  an  estimator  are  the  mean  and  covariance  of  estimate  t  [39].  The  estimator 
mean  is  used  to  determine  if  an  estimator  is  biased.  The  mean  and  covariance  are  defined  as 

mean(t)  =  £?{t}  (7) 

and 

covariance(t)  =  E  |[t  -  £(t)][t  -  E(t)]T| ,  (8) 

where  the  superscript  T  represents  the  transpose  operator,  t  is  said  to  be  an  unbiased  estimator 
of  the  random  parameter  T  if  E{ t}  -  £{T}  =  0.  The  second  property,  the  covariance  of 
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the  estimator,  shows  up  in  the  definition  for  the  error  covariance  matrix: 


C«„  =  £?{[*— T][t  —  Tf} 

=  J5{[t-B(t)][t-£(i)]T}  +  [£(t)-T][ii;(t)-T]T,  (9) 

where  the  first  term  represents  the  covariance  of  t  and  the  second  term  is  the  squared  bias. 
The  mean  squared  error  (MSE)  of  the  estimator  t  is  a  sum  of  the  diagonal  elements  of  the 
error  covariance  matrix.  The  MSE(t)  is  defined  as  the  trace  of  the  error  covariance  matrix, 

N 

MSE(t)  =  tr(Cerr)  Tn,]2}  ,  (10) 

71  =  1 

where  tr(-)  denotes  the  trace  operator. 

Now  that  we  have  discussed  the  ML  estimation  technique  and  a  few  properties  of 
estimators,  we  can  proceed  to  the  derivation  of  the  joint  PDF  describing  the  WFS  image.  The 
process  begins  by  deriving  the  joint  PDF  corresponding  to  a  single  image. 

2.3  Single  Image 

The  formation  of  a  single  image  is  a  fundamental  building  block  of  this  thesis.  The 
WFS  image  is  viewed  as  a  collection  of  single  images  related  by  the  statistics  governing 
the  wave  front  slopes  over  the  subapertures  of  the  H-WFS.  This  derivation  incorporates  the 
doubly  stochastic  nature  of  the  optical  detection.  The  following  subsections  discuss  the  image 
formation,  the  optical  detection,  the  random  rate  function,  and  the  PDF  describing  a  single 
image. 

2.3.1  Image  formation.  The  propagation  of  incoherent  light  through  the  turbulent 
atmosphere  to  the  back  focal  plane  of  a  lens  is  modeled  by  [18] 

i{x\(f)—  f  o(x')s(x  —  x1 -,(/))  dx1 ,  x  £  X,  (11) 

Jx'ex1 


14 


where  i(x;  cj>)  is  the  resultant  image  irradiance;  o( a?)  is  the  object  of  interest;  s(x  —  x'\  4>) 
is  the  space-invariant  PSF;  x  €  X  is  the  two-dimensional  (2-D)  image  plane;  x'  G  X'  is 
the  2-D  object  plane;  and  <fi  represents  the  turbulence  induced  phase  errors.  If  the  angular 
extent  of  the  object  is  sufficiently  small,  on  the  order  of  a  few  arc  seconds  [32],  then  the 
isoplanatic  assumption  is  valid  and  the  PSF  can  be  written  as  a  function  of  the  difference 
of  the  coordinates.  Hence,  incoherent  imaging  can  be  modeled  as  a  convolutional  process 
as  in  Eq.  (11).  The  image  is  detected  using  a  discrete  photon  counting  device,  such  as  a 
charged-cooled  device  (CCD)  camera  [42].  Since  the  CCD  camera  has  pixels,  the  recorded 
image  is  actually  a  sampled  version  of  the  continuous  image  intensity  distribution  focused 
onto  the  image  plane  Eq.  (11).  The  spatial  sampling  is  implicitly  modeled  by  restricting  the 
2-D  position  vector,  x,  to  the  discrete  image  (sample)  space  S,  by  writing  Eq.  (11)  as 

i[x\(j)]=  (  o(x')s(x  —  x1]^)  dx1,  xeS,  (12) 

Jx'GX1 

where  the  image  irradiance  is  denoted  by  the  symbol  i[x\  <fi]  and  the  square  brackets  indicate 
the  implicit  sampling  of  the  continuous  process  [40].  The  terms  image  irradiance,  image 
intensity  distribution,  and  expected  image  are  synonymously  used  to  refer  to  the  image  defined 
in  Eq.  (12). 

The  linear  systems  approach  provides  a  convenient  framework  for  modeling  incoherent 
imaging.  The  following  development  relates  the  shape  of  the  aperture  to  the  unaberrated  PSF. 
This  approach  closely  follows  that  of  Roggemann  and  Welsh  [36].  In  general,  the  expected 
image  is  a  function  of  the  object  and  the  PSF.  But  when  the  object  is  a  point  source,  the 
expected  image  is  equivalent  to  the  PSF  as  shown  by  replacing  o(x)  in  Eq.  (12)  with  S(x), 
yielding  [11] 

=  [  6(#)s{x-$-4>)&$,  SeS  (13) 

Jx'eX' 

=  s[x;<f>],  xes.  (14) 
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Thus  the  expected  image  is  determined  by  the  atmospheric  turbulence  and  aperture  geometry. 
The  following  development  illustrates  how  the  atmospheric  turbulence  and  aperture  geometry 
influence  the  resultant  image  of  a  point-source  object. 

For  a  single  aperture,  the  unaberrated  PSF  is  given  by  the  modulus  squared  of  the 
coherent  impulse  response,  hopt(x).  Expressed  mathematically  as 

•Sopt(^)  =  |^'opt(*^)|  >  (15) 

where  sopt(x )  is  the  unaberrated  PSF.  Whereas  the  PSF  maps  the  object  intensity  to  the  image 
plane,  the  coherent  impulse  response  maps  the  complex  field  of  the  object  to  the  complex  field 
of  the  image.  The  coherent  impulse  response  is  equal  to  the  inverse  Fourier  transform  of  the 
coherent  transfer  function  (CTF)  and  can  be  expressed  as 

h(x)  =  T~l[H{u)\  =  [  H(u)  exp{j2Tr(u  ■  x)}  du,  (16) 

JuGU 

where  T~r  is  the  inverse  Fourier  transform  operator;  H(u)  is  the  CTF;  ft  is  a  2-D  spatial 
frequency  variable;  and  u  €  U  represents  the  2-D  Fourier  domain.  The  shape  of  the  aperture 
is  described  by  the  support  of  the  pupil  function,  Wp(x),  and  is  related  to  the  CTF  by 


H(u)  =  Wp(uXavgfi),  (17) 

where  Xavg  is  the  average  wavelength  and  //  is  the  focal  length  of  the  lens.  To  include  the 
random  phase  fluctuations  of  the  atmosphere,  define  the  generalized  pupil  function  [14] 

W(x ;  0)  =  exp {-j(f>{x)}  Wp(x).  (18) 

For  this  work,  the  phase  (j>(x )  is  modeled  as 


4>{x)  ~  a  •  x 


(19) 
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where  •  is  the  vector  dot  product.  Since  the  phase  slope  is  most  likely  not  a  planar  surface,  a 
represents  the  least-mean-squares  slope  of  the  wave  front.  Thus  4>(x)  =  a  ■  x  is  a  tilted  plane 
within  the  subaperture.  This  approximation  improves  as  the  ratio  r0 / L  gets  bigger,  where 
rQ  is  the  Fried  coherence  diameter  characterizing  the  seeing  and  L  is  the  maximum  extent  of 
the  subaperture.  Rewriting  Eq.  (18)  using  the  approximation  in  Eq.  (19)  yields  the  following 
generalized  pupil  function: 


W(x ;  a)  =  exp{— j(a  •  a?)}  Wv{x). 


(20) 


Substituting  the  generalized  pupil  function  of  Eq.  (20)  into  Eq.  (17)  yields 

H(u)  =  exp {-j{a  •  u\avgft)}Wp(u\avgfi).  (21) 


Evaluating  Eq.  (16)  with  the  CTF  of  Eq.  (21)  yields  the  desired  PSF  relation  [11] 
s(x;a)  =  |.F-1[exp{— j(a  •  u\aVgfi)}  Wp(u\avgfi)]f  , 


(22) 


where  hi  x  — 


avgfl 

27T 


a  )  is  the  coherent  impulse  response  shifted  by 


avgfl  ■ 

27T 


-a. 


2.3.2  Single  recorded  image.  The  image  recorded  by  a  CCD  camera  is  subject  to  nu¬ 
merous  sources  of  error.  The  CCD  sensor  is  subject  to  photo-conversion  noise,  readout  noise, 
and  various  background  noises  [42, 43].  For  astronomical  imaging,  the  photo-conversion  and 
readout  noises  are  the  dominant  sources  of  error.  Photo-conversion  noise  characterizes  the 
random  arrival  of  photons  and  is  modeled  by  the  Poisson  random  process  d[x\  [49].  Readout 
noise  is  also  a  stochastic  process  and  is  often  modeled  with  the  Gaussian  distribution  n[x ] 
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[30].  The  recorded  image,  r[x],  is  modeled  by 


r[x]  —  d[x]  +  n[ce], 


(23) 


where  d[x\  is  the  detected  image  and  n[x]  is  a  zero  mean  Gaussian  random  process  [4].  In  this 
research,  the  detector  is  assumed  to  be  photon  noise  limited.  In  other  words,  the  randomness 
of  the  optical  detection  was  assumed  to  be  the  dominant  source  of  uncertainty.  Hence,  the 
recorded  and  detected  images  are  the  “same.” 


2.3.3  Single  image  optical  detection  model.  The  randomness  inherent  in  optical 
detection  is  accurately  modeled  by  the  Poisson  random  process  when  the  semi-classical  model 
of  photo-electric  detection  is  employed.  This  model  is  valid  because  the  three  defining 
assumptions  of  the  Poisson  point  process  are  satisfied  [15].  In  short,  the  probability  of 
observing  one  photo-event  at  a  time  is  proportional  to  the  observation  time,  detection  area 
and  intensity;  the  probability  of  more  than  one  photo-event  per  observation  time  interval  is 
negligible;  and  the  number  of  photo-events  occurring  in  non-overlapping  time  intervals  are 
statistically  independent.  Thus,  the  discrete  image  is  sufficiently  modeled  as  a  collection  of 
independent  Poisson  random  variables  [15].  Image  detection  can  be  cast  as  the  probability  of 
detecting  d[x\  photo-events  at  location  x.  This  detection  probability  is  given  by  the  following 
conditional  Poisson  PDF  [30]: 


|  A[x])  — 


A[x  —  xs]d^  exp{— A[x  —  x^]} 

4ip  1 


x  6  S, 


(24) 


where  D[x]  is  a  random  variable  characterizing  the  number  of  photo  events  at  location  x  €  <S; 
d[x]  is  a  realization  of  D[x];  A  [a:]  is  a  random  variable  characterizing  the  rate  function  at 
location  x;  X[x  —  is  a  specific  realization  of  A[x]; 


xs  = 


27T 


a 


(25) 


18 


is  the  random  shift  defined  in  Eq.  (22);  and  /(•)  is  used  to  denote  the  PDF  of  both  continuous 
and  discrete  distributions.  The  rate  function  represents  the  average  photon  count  for  a  particular 
pixel  and  can  be  expressed  as 


A[x  -  XS]  =  £/)[£] | A[x] {£>[£]}, 


(26) 


where  E{  }  is  the  expectation  operator.  The  rate  function  is  also  equivalent  to  the  expected 
image  irradiance  given  by  Eq.  (22).  Thus,  for  a  point  source,  the  rate  function  is  equivalent  to 
the  PSF,  A [•]  =  s[-].  The  joint  PDF  of  the  photo  events  for  all  of  the  pixels  in  the  image  is  a 
product  of  conditional  Poisson  PDFs  given  in  Eq.  (24), 


/d|a(<1|A)  =  JJ 
xGS 


A  [a?  -  xs]d^  exp{-A[x  —  afg]} 
d[x\\ 


(27) 


where  D  =  {D[a?]:af€c>}  and  A  =  {A  [a:]  :  x  G  <S}  are  vectors  of  random  variables,  and 
d  =  {d[x}  :  x  G  5}  and  A  =  {X[x  —  xs]  :  x  e  S,xs  E  X }  are  vector  representations  of 
specific  realizations.  The  marginal  PDF  characterizing  the  randomness  of  the  rate  function  is 
discussed  in  the  following  subsection. 


2.3.4  The  random  rate  function.  In  general,  the  rate  function  depends  on  both  the 
imaged  object  and  the  shape  of  the  aperture.  For  a  point-source  object  or  a  system  with  a  small 
aperture,  the  imaged  object  appears  as  an  unresolved  spot.  The  entire  spot  is  assumed  to  be 
contained  entirely  within  the  image  detector  area.  The  relationship  between  the  spot  location 
and  compactness  is  shown  in  Fig.  (3).  The  intensity  distribution  of  the  spot  is  controlled  by 
the  shape  of  the  aperture  and  the  relative  position  is  related  to  the  wave  front  slope.  The  spot 
is  offset  from  the  optical-axis  (center  pixel)  by  the  shift  parameter  xs,  which  is  a  realization 
of  the  random  shift  parameter  Xs-  Hence  the  shift  incorporates  the  randomness  of  the  rate 
function  and  the  conditional  PDF  of  Eq.  (27)  is  re-denoted  by  /D|*s(d|zs).  The  conditional 
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Figure  3.  Single  image  of  a  compact  spot  overlay ed  on  the  detector  pixel  array 


PDF  is  repeated  for  the  sake  of  clarity: 


A  [a?  —  exp{— A  [a?  —  a:#]} 

d[a:]! 


(28) 


where  Xg  represents  the  random  bivariate  shift  variable  and  Xg  is  a  specific  realization. 
The  wave  front  tilt  is  manifested  as  a  shift  xg  = 


^avgfi  ^  SpQt  iocatjori  in  the 

Z7T 


subimage.  The  PDF  of  the  random  shift,  xg,  is  modeled  as  a  product  of  two  independent  and 
identically  distributed  (IID)  zero  mean  Gaussian  random  variables  [8,  15].  The  zero  mean 
bivariate  normal  has  the  following  PDF: 


(29) 


ey 

where  a  is  the  variance  of  the  random  shift  Xg  and  since  xg  = 


xg 

VS 


.  I^l2  =  4  +  vs- 


2.3.5  Single  image  probability  density  function.  By  combining  the  results  of  the 
previous  subsections,  the  joint  PDF  describing  a  single  detected  image  yields 


fn,xs^^Xs)  ~  fu\Xs^^s^3cs(^s)^  (30) 

where  /D|xs  (d|^s)  is  found  using  Eq.  (28)  and  fgs  (xg)  is  given  in  Eq.  (29).  In  the  following 
section,  the  PDF  describing  the  data  recorded  by  all  of  the  subapertures  of  the  WFS  is 
developed. 
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Figure  4.  Pixel  and  subimage  labeling  system  for  the  wave  front  sensor  image.  Note  how 
the  pixels  labeled  by  Xi  and  xt>  are  in  the  same  relative  positions  in  all  of  the 
subimages. 

2.4  Wavefront  sensor  image 

The  H-WFS  can  be  modeled  as  a  set  of  correlated  diffraction  limited  imaging  systems. 
The  expression  for  the  PDF  describing  the  image  in  a  single  aperture  is  extended  to  the  case  of 
multiple  subapertures  where  the  geometry  and  arrangement  of  the  subapertures  is  completely 
arbitrary.  The  WFS  image  is  defined  as  a  set  of  J  subaperture  images  or  subimages,  with  I 
pixels  in  each  subimage  as  shown  in  Fig.  (4).  Note  that  there  are  I  =  25  pixels  in  each  of  the 
J  =  4  subimages  in  this  example. 

The  pixels  in  each  of  the  subimages  are  modeled  by  a  collection  of  I  independent 
Poisson  random  variables  as  in  Eq.  (24).  Let  the  photon  count  in  a  particular  pixel  be  denoted 
by  dj[xi],  where  i  refers  to  the  pixel  location  in  the  jth  subimage  and  Xi  €  S.  The  position 
vector  &i  locates  the  ith  pixel  in  any  subimage  as  shown  in  Fig.  (4).  The  shift  parameters  xSj 
and  xs.,  are  displayed  in  Fig.  (5). 

The  joint  PDF  describing  the  WFS  image  is  a  generalization  of  Eq.  (30).  The  PDF  is 
a  product  of  the  I  x  J  conditionally  Poisson  PDFs  [30]  and  the  joint  PDF  of  the  subaperture 
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j  th  subimage 


f  th  subimage 


:  j 

_L 

la 

Figure  5 .  Wave  front  sensor  image  with  compact  spots  overlayed  on  the  detector  pixel  arrays . 
This  figure  is  intended  to  show  the  randomness  of  the  shift  parameters. 

shifts,  Xs  =  (XSl Xsj), 


/D,Xs(d)X5)  =  /D|xs(d|x5)  /xs(xs)»  (31) 


where 


w»i*o  =  n  n 


(32) 


The  joint  PDF  of  the  random  shift  vector,  X5  =  (X^, . ..  ,Xsj),  is  derived  by 
extending  the  bivariate  Gaussian  of  Eq.  (29)  from  one  parameter  to  multiple  parameters.  The 
joint  PDF  of  Xs  is  the  zero  mean  multivariate  normal  density 


/xs(x5)  =  (2tt)  j|R|  1/2  exp  j— ^x^R  xxs| ,  (33) 
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where  |R|  is  the  determinant  of  the  block  correlation  matrix  R  [28].  The  correlation  matrix 
is  a  block  matrix  with  the  following  form: 


R  = 


Ri,i 

•  ’  •  R-i  ,j 

Rj,i 

•  •  • 

where  each  block  is  given  by 


Rjj'  —  E 


( 

- 

- 

T  \ 

\ 

xSj 

l 

J? 

i _ 

l 

1 _ 

* 

(34) 


(35) 


and  xsj  and  ysj  are  the  coordinates  of  the  spot  centroid  in  the  jth  subimage  [5].  In  Eq.  (33), 
xs  is  actually  a  2  J-length  column  vector  written  as 


x5  = 


xSi 

XSi 

ys ! 

XSj  _ 

XSj 

ysj 

(36) 


which  is  compatible  with  the  size  of  the  2  J  x  2  J  correlation  matrix  of  Eq.  (34).  The  cross 
correlation  between  the  x  and  y  coordinate  centroid  shifts  are  zero  when  j  =  j'  since  the  shifts 
are  assumed  to  be  independent  zero  mean  Gaussian  random  variables  for  a  single  subaperture 
[28],  i.e. 


3  J 


(37) 


where  er2  is  the  variance  of  the  jth  random  shift  in  the  both  coordinate  directions:  X s}  and  Ysi . 
In  general,  however,  the  cross  correlations  R^  in  Eq.  (35)  are  nonzero.  Since  R  £  7?,2Jx2J 
and  Rj;-i  =  R^,  R-1  is  also  real  and  symmetric  [45].  Additionally,  R  is  positive  definite, 
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which  means  that  zTRz  >  0,  Vz,  thus  R-1  is  also  positive  definite  [28].  Note  that  the  specific 
correlation  properties  of  the  spot  shift  or  equivalently  the  wave  front  slope  was  not  essential 
to  this  derivation. 


2.5  Conclusions 

The  linear  systems  framework  was  used  in  modeling  the  incoherent  imaging  process. 
Optical  detection  was  modeled  by  a  conditionally  Poisson  process,  where  the  shift  parameter 
vector  is  a  random  process.  Equation  (31)  will  be  used  to  find  the  most  likely  value  for  the 
random  vector  X$  for  an  observed  WFS  image,  d,  in  the  next  chapter  using  ML  estimation 
to  optimally  determine  the  wave  front  slope  in  the  pupil  of  the  telescope. 
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III.  Maximum  likelihood  parameter  estimation 

3.1  Introduction 

The  purpose  of  the  previous  chapter  was  to  introduce  the  estimation  theory  and  model 
the  image  captured  by  the  wave  front  sensor  (WFS)  with  a  probability  density  function  (PDF). 
The  PDF  describing  the  WFS  image  serves  the  role  of  the  likelihood  function  in  maximum 
likelihood  (ML)  estimation.  The  logarithm  is  a  nondecreasing  function  of  the  argument. 
Taking  the  logarithm  of  exponential  functions  greatly  simplifies  the  maximization  procedure. 
Thus  the  log-likelihood  function  will  be  maximized  with  respect  to  the  shift  parameters.  The 
bulk  of  the  derivation  is  divided  into  two  parts.  In  the  first  part,  the  score  function  is  derived 
for  an  arbitrary  rate  function  and  manipulated  into  a  form  where  the  shift  parameters  for  the 
J  subimages  are  grouped  into  the  shift  parameter  vector  xg.  In  the  second  part,  the  score 
function  is  evaluated  for  a  scaled  Gaussian  rate  function.  The  Gaussian  function  adequately 
models  the  main  lobe  of  the  intensity  distribution,  which  is  equivalent  to  the  rate  function  as 
stated  earlier,  provided  the  volume  of  the  main  lobe  is  approximately  equal  [52]. 

3.2  Log-likelihood  function 

The  likelihood  function  is  a  scalar  defined  by  Eq.  (31)  and  the  log-likelihood  function 
is  formed  by  taking  the  natural  logarithm  of  both  sides  of  Eq.  (31): 

L(xs,  d)  =  dA%i\  -  XS]]}  -  A [Xi  -  xSj]  -  ln{d; [£*]•'} 

j= 1 i= 1 

+  In  {(2tt)-j}  +  In  {|R|-1/2}  -  (38) 

3.3  Deriving  the  score  function 

The  ML  estimate  of  the  shift  parameter  vector  is  the  root  of  the  score  function.  The 
score  function  is  calculated  by  taking  the  gradient  of  the  log-likelihood  function  with  respect 
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to  the  shift  parameters 


Q 

s(*s,d)  = —L(yiS,d).  (39) 

Substituting  Eq.  (38)  into  Eq.  (39)  and  evaluating  the  derivative  yields 

s(xs,d)  =  jsZSdjPi]  ln{A[fj  -  -  A[fj  - 

-5^{5x?r"1xs}'  <40) 


Next,  we  rewrite  the  set  of  terms  in  the  double  summation  in  Eq.  (40).  The  first  step 
involves  swapping  the  order  of  the  summations,  so  that  the  inner  sum  is  now  over  the  ith  pixel 
in  all  J  subapertures  versus  summing  the  I  pixels  in  each  subaperture.  Next,  the  sum  over  j 
is  expanded  and  the  terms  are  collected  into  vectors  in  Eq.  (41).  The  double  sum  of  Eq.  (40) 
is  thus  written  as 


Y  Y  dA*i\  ln{A[£i  -  -  A  [Si  -  xSj] 

j— 1  i= 1 

=  YY  dj[*i)  ln{A[£i  -  £<?,]}  -  A  -  xS)] 

j= 1 
I 

=  Y dA*i]  ln{A[^  -  £$,]}  -  A [xi  -  £5l]  +  •  •  • 

i=l 

+  dj[xi ]  ln{A[®i  -  f5j]}  -  A  [xi-  xSj) 

=  Y  ln  {A  [xilT  -  x|] }  d[®i]  -  A  \xilT  -  x£]  1 

2  —  1 

where 


1  = 

1 

J* 

M 

II 

Xi 

1 

Xi 

(41) 


(42) 
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A[a?jl  —  xs]  and  In  {Afx^l  —  x^]}  are  short-hand  notations  for 


A[xjl  -  xg  = 


and 


ln{A[fjl  -x5]}  = 


X[xi-xSl] 

Hxi  -  $Sj] 

ln{A[*i  -x5l]} 
ln{A[xi  -  £<?,]} 


(43) 


(44) 


respectively;  d[^]  is  a  column  vector  representing  the  data  in  the  ith  pixel  location  of  each 
subimage, 

di[xi] 


dfc]  = 


dj 


(45) 


and  Xg  is  the  shift  parameter  vector.  Also,  note  that  the  following  notation  applies: 


and 


A  \xilT  -  x£  =  [Afcl  -  xs]]J 


In  |A  [xjlT  —  x|j  |  =  ln{A[xjl  -  x5]}]T. 


(46) 


(47) 


Substituting  the  result  of  Eq.  (41)  into  Eq.  (40)  yields  the  following  simplification  for  the  score 
function: 

s(x5,d)  =  ^  jx>{A[®ilT-x|]}  d[^]-A[xilr-x^]l| 

{^x5R_1Xs},  (48) 


d  (1 
dXg 
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where  the  score  function  is  seen  to  be  dependent  on  the  shift  parameter  vector  x$.  To  estimate 
the  shift  parameters,  set  the  score  function  equal  the  zero  vector,  as  in  Eq.  (5),  and  then  apply 
the  gradient  operator  to  the  individual  terms  in  the  summation  yielding 


E  ln  {A -  x?] }  d[^]-^-A[^lr-x| 

i=  1  ^ 


d_ 

dxs' 


1  - 


e  n 

dxs 


{-x|R  xx5}  =  0. 

(49) 


The  gradient  of  the  natural  logarithm  of  A[-]  can  be  simplified  in  the  following  manner 


[16]: 


_d_ 

dxc 


ln{A[fjl  —  x^]}  = 


8 


8xSl 

d 


ln{A[a?j  —  ^sj} 


dx 


Sj 


ln{A[z;  -  xSl}} 


8 


dx 


Si 


ln{A[x;  -x5j]} 


d 


dx 


Sj 


ln{A[x,  -  x5j]} 


•  (50) 


Evaluating  the  partial  derivatives  of  Eq.  (50)  yields 


d 

dxs 


ln{A[xjl  -  x<?]} 


d\[xj  -  xSl 


A[xi  -  x5l]  dxSl 


d 

dxs 


A[x4  -  xs]  Ad\ 


1  dA[x»  - 
A[xi  -  xSj]  dxSj 


(51) 


where  A^,1  is  a  diagonal  matrix 


1 


A[Xi  2?5i, 


\[Xi  -  XSj]  J 


(52) 
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Q 

and  — — X[xil  —  x$]  is  the 
0x5 


gradient  of  a  vector  (which  is  also  a  diagonal  matrix) 


8 

dxs 


A[f<l  -xs] 


d 

dxSl 


A[xj  -  xSl 


8 


dx 


Si 


-A[£;  -  xSj 


4zxlii~3s‘] 

a|-A[Si  -  Ss'l 


d 

dxSj 


A  [xi  -  xSj ' 


0 


0 


d 

dxSj 


A[£i  -  xSj[ 


(53) 


The  third  term  of  Eq.  (49)  depends  on  the  shift  vector  xs  and  the  correlation  matrix 
R.  This  third  term  can  be  reduced  using  basic  linear  algebra.  The  term  x^R^x^  is  in  pure 
quadratic  form  [45].  Given  that  xs  €  H2J,  R-1  €  ft2Jx2-7,  and  [R_1]T  =  R-1,  the  gradient 
of  x^R”  _1X5  with  respect  to  the  vector  xs  is  [16] 


a 

dxs 


x^R  XX5  =  2  R  1 


X5- 


(54) 


Combining  the  results  of  Eqs.  (51)  and  (54),  Eq.  (49)  can  be  expressed  as 


1 


E 


d 

dxs 


T 


(A^dfxi]  -  1)  -  R  1xs  -  0. 


(55) 


The  score  function  expression  in  Eq.  (55)  represents  the  gradient  of  the  log-likelihood 
function  of  the  WFS  image  set  equal  to  zero.  Solving  Eq.  (55)  for  xs  yields  the  ML  estimate 
of  the  shift  parameters.  Additional  progress  cannot  be  made  without  assuming  a  specific  form 
for  the  rate  function.  In  the  following  section,  a  form  for  the  rate  function  is  assumed  and  the 
derivation  is  continued. 
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3.4  Solving  for  the  shift  parameter  vector  X5 


In  solving  for  the  shift  parameter  vector,  the  generality  of  the  subaperture  shape  is  set 
aside  to  make  the  mathematics  tractable.  In  the  following  derivation,  the  intensity  profile  of 
the  spot  in  each  subimage  is  assumed  to  be  a  Gaussian  function.  The  main  goal  in  modeling  the 
intensity  distribution  is  to  accurately  estimate  the  main  lobe  corresponding  to  the  centroid  of 
the  spot.  A  Gaussian  intensity  profile  scaled  by  the  average  number  of  photo-events  detected 
in  the  subimage  is 


X\xi  -  xSj]  = 


KjP 
2tt  a2 


exp 


(56) 


where  lx  1 2 


—  XX  — 


x2  +  y2  and  the  average  photon  count  in  the  jth  subimage  is 


(57) 


adequately  models  the  intensity  distribution  in  the  subaperture.  The  rate  function  in  Eq.  (56) 
is  patterned  after  Winick’s  spot  intensity  distribution  [52].  The  rate  function  is  modeled  by  a 
Gaussian  distribution  scaled  by  the  average  photon  count  Kj  and  the  pixel  area  l2,  the  mean 
is  represented  by  the  shift  parameter  xsp  and  the  spot  size  is  controlled  by  the  variance  a2. 
The  variance  a2  is  dependent  on  the  average  imaging  wave  length  Xavg,  the  focal  length  of 
the  lenslets  /;  [52],  and  the  dimension  of  the  subaperture  L.  This  Gaussian  function  simplifies 
the  analytical  solution  for  the  shift  parameters. 

Although  scintillation  effects  on  the  wave  front  are  overshadowed  by  the  phase  pertur¬ 
bations  [7],  the  rate  function  in  Eq.  (56)  is  scaled  by  the  photon  count  in  each  subimage,  Kj,  to 
allow  for  scintillation  effects  from  subaperture  to  subaperture.  However,  scintillation  across 
each  subaperture  is  assumed  to  be  negligible.  In  other  words,  the  wave  front  distortion  across 
each  subaperture  will  be  modeled  as  a  random  tilting  of  the  plane-wave  front. 

The  analytical  solution  of  Eq.  (55)  for  the  rate  function  defined  in  Eq.  (56)  depends  on 
the  following  assumptions:  the  spot  intensity  distribution  must  be  sufficiently  compact  so  that 
light  passing  through  lenslet  j  does  not  bleed  over  into  the  detector  elements  of  a  neighboring 
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subaperture  j'  and  large  enough  so  that  more  than  one  pixel  reports  a  photo  event,  hence  the 
following  approximation: 


i 


i= 1 


(58) 


for  all  j.  Figure  (5)  displays  compact  spots  overlayed  onto  the  detector  pixel  array.  The 
derivative  of  the  Gaussian  rate  function  defined  in  Eq.  (56)  with  respect  to  the  jth  shift  yields 
a  diagonal  matrix  with  the  jth  element  given  as 


d 


dx< 


\r  zt  zt  1 3  v  * 

-A[Xi  XS}]  -  27rcT4 


Kjl  ( %Sj)  \  %Sj 

_ _ _ ; _  ovrv  J _ L 


exp 


(59) 


Substituting  the  results  of  Eq.  (59)  into  Eq.  (53)  enables  the  summand  of  Eq.  (55)  to  be 
expressed  as 


dxf 


A  [xilT  -  xf  ]  (A^d^]  -  1) 


eXP|  2a2  } 
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Simplifying  Eq.  (60)  gives 


dxs 


A[xilT  -  xf]  (ApMfxi]  -  1) 


Xi  -  xSl 


di[xi 1  - 


K^ixi-xs,) 


2ira% 


exp 


Vi  -  *Sj  J  ,  Kjfjxj  -  xsj )  )  \Xj  -  xsj 

a2  WJ  2X0}  P\  2a* 


f  |®i  -  xSl\2\ 

l  20-s  I 

f  l3-fr,n 

l  K  jJ 


(61) 


A  sum  of  vectors  is  equivalently  stated  as 


N 

EZ"  = 


71  =  1 


N 

Ez*,i 

71=1 


N 

n=l 


where  zn  €  71 


jit 


(62) 


The  equivalence  expressed  in  Eq.  (62)  shows  how  the  summation  operation  in  the  score 
function  of  Eq.  (55)  can  be  moved  inside  the  column  vector  to  sum  the  elements.  The  score 
function  relation  in  Eq.  (55)  can  be  expressed  in  the  following  matrix  notation: 


1  l2(xi-xSl) 

— a— <M*i]  -  E - 


E 

2  =  1 


Z=1 


2tt  o} 


exp 


\%i  XSi\ 

2(Jp 


-  R  1  x,?  =  0. 


I 

E 

i=l 


Xi 


p±d, [Si]  -  jz 


2  =  1 


2  7TO-4 


exp 


(63) 

Using  the  approximation  stated  in  Eq.  (58),  the  second  summation  in  each  element  of 
Eq.  (63)  equates  to  zero.  By  replacing  the  summation  notation  with  the  inner  product  of  two 
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vectors,  Eq.  (63)  can  be  written  as 


df(x-®Sll) 


R  1  x5  =  0, 


where 


dj  = 


dj[x  i] 
dj  [®r] 


represents  the  detected  photo-events  (data)  for  the  jth  subimage. 


x  = 


X\ 


Xl 


(64) 


(65) 


(66) 


is  the  pixel  position  vector,  and  1  is  a  column  vector  of  ones.  The  data  is  grouped  in  two 
manners:  d j  denotes  the  data  vector  for  all  I  pixels  in  the  jth  subimage  and  d[a?j]  denotes  the 
data  vector  for  the  ith  pixel  in  all  J  subimages. 

Simplifying  Eq.  (64)  by  separating  the  data  and  positioning  components  from  the  shift 
parameters  and  multiplying  through  by  a2  yields 


dfx 

_ 

dfl 

0 

xs  -  a2v  R  1  x5  =  0. 

(67) 

i 

X 

0 

djl 

The  dj  1  terms  on  the  diagonal  is  another  way  of  expressing  the  total  number  of  photons 
detected  in  each  subimage.  Grouping  the  matrices  multiplied  by  xs  yields 
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0 


+  <7p  R  1  x5  =  0 


or  in  matrix  form 


Cxc  =  m. 


The  C  (which  acts  as  a  Correction  factor)  and  m  (which  resembles  a  moment  calculation)  are 
defined  as 


+  CFp  R-1  =  K  +  R" 


with  the  photon  count  matrix  defined  as 


Finally,  the  ML  estimate  Xs  of  the  shift  parameter  vector  is  obtained  by  solving  Eq.  (69)  for 
x$.  If  the  correction  factor  matrix,  C  is  nonsingular,  then  the  solution  is 


x5  =  C^m. 
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If  C  is  singular,  then  calculate  using  the  least  mean  squares  technique  [45]: 


(74) 


where  (Crc)  CT  is  known 


as  a  pseudo  inverse. 


Additionally,  by  the  invariance  property  of  the  ML  estimate  of  the  shift  parameter  vector 
X5,  we  can  compute  the  ML  estimate  of  the  wave  front  slope  over  each  subaperture.  The 
slope  vector  a,  is  defined  as 

U1 


a  = 


(75) 


a.j 


The  approximate  slope  for  a  single  aperture,  a),  is  proportional  to  the  shift,  xs} ,  or  offset  of 

2tt 

the  spot  in  the  subimage  by  a)  =  - - T^Sp  where  Xavg  is  the  average  imaging  wave  length 

* avgfl 

and  fi  is  the  lens  focal  length.  Thus,  the  ML  wave  front  slope  estimate  across  the  telescope 
pupil  is 


a  = 


2-7T 

Xavg  f l 


x<?. 


(76) 


3.5  Conclusions 

In  this  chapter,  the  ML  technique  was  employed  to  estimate  the  wave  front  slope  using 
the  PDF  constructed  in  Chapter  II  to  optimally  determine  the  wave  front  slope  in  the  pupil  of 
the  telescope. 
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IV.  Analysis  &  discussion 


4.1  Introduction 

Preceding  chapters  modeled  the  data  recorded  by  the  Hartmann-type  wave  front  sensor 
(H-WFS)  with  a  probability  density  function  (PDF)  and  then  estimated  the  shift  parameter 
vector  using  maximum  likelihood  (ML)  estimation.  This  chapter  explores  a  few  limiting  cases 
for  the  shift  estimator  and  derives  expressions  for  the  first  moment  and  the  mean  squared  error 
(MSE)  of  the  estimator.  The  ML  shift  estimator  is  shown  to  be  unbiased.  The  expression  for 
MSE  of  the  estimator  is  shown  to  depend  on  the  light  level,  the  correlation  properties  of  the 
wave  front  slope  statistics,  and  the  classical  centroid  shift  calculation. 

With  slight  loss  of  generality,  the  light  level  in  each  subaperture  is  replaced  by  the 
average  photon  count  Kj,  defined  as 


Kj  =  E{Kj}, 


av 


where  Kj  =  dj  1.  Subsequently,  K  is  redefine  as  the  average  photon  count  matrix: 


K  = 


0 


(78) 


Note  that  K  is  now  deterministic.  The  shift  parameter  estimate  given  in  Eq.  (73)  is  repeated 
here  for  convenience: 

xs  =  C-1m,  (79) 

where 

C  =  K  +  <jp  R-1  (80) 
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is  assumed  to  be  nonsingular.  Using  the  invariance  property  of  the  ML  estimate,  the  ML 
estimated  slope,  a,  is 

a  =  t — j  xs,  (81) 

where  Xavg  is  the  average  imaging  wavelength  and  //  is  the  focal  length  of  the  lenslets.  Since 
the  slope  and  shifts  are  related  by  a  constant,  the  properties  of  the  slope  estimator  will  be 
explored  using  the  shift  parameter  estimate,  Xj.  In  the  next  section,  the  shift  vector  estimate 
is  shown  to  be  a  function  of  the  classical  centroid  calculation. 

4.2  The  classical  centroid  method 

The  classical  centroid  method  of  calculating  the  spot  centroid  is  adequate  for  high 
light.  Several  researchers  have  documented  the  performance  of  the  technique  by  modeling  the 
optical  detection  process  with  the  Poisson  point  process  [3,  24,  37].  However,  for  techniques 
used  in  wave  front  reconstruction,  a  more  accurate  technique  may  be  necessary  for  low  light 
level  imaging.  Scintillation  effects  from  subaperture  to  subaperture  are  accounted  for  in  the 
model  since  Kj  is  not  necessarily  equal  to  Kj>. 

Consider  writing  Eq.  (79)  as 

x5  =  C_1KK-1m  =  C^m,  (82) 

where  m  is  equivalent  to  the  classical  centroid  calculation  for  the  shift  vector  [24,  37].  The 
classical  centroid  offset  calculation  is 

rh\ 

% 

:  =  K-1m,  (83) 

mj 

Kj  . 


dfx 

dfl 

dTx 

.  dfT 
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Thus,  the  ML  estimate  its  is  a  function  of  the  classical  centroid  shift.  From  Eq.  (82),  the 
inverse  of  the  modified  correction  factor  matrix  is  defined  as 


CT1  =  C^K,  (84) 

which  implies 

C  =  (C^K)-1  =  K_1C.  (85) 

Substituting  Eq.  (70)  into  Eq.  (85)  yields  the  final  form  of  the  modified  correction  factor 
matrix: 

C  =  I  +  <Tp  K-1R_1.  (86) 

Harnessing  the  power  of  the  Sherman-Morrison- Woodbury  formula  [12],  the  inverse  of  C_1 
becomes 

CT1  =  I-1  —  I_1(orp  K-1)  [i  +  R-1I_1((7p  K-1)] _1  R-1I-1, 

=  I  —  K-1  (i  +  R_1K_1)-1  R_1, 

=  I-^I+^RK^  .  (87) 

Substituting  Eq.  (87)  into  Eq.  (82)  achieves  the  following  relation  for  the  ML  shift  estimate: 

I  +  ^RK) 

Hence,  the  ML  estimator  for  the  shift  vector  Xs  is  equal  to  the  difference  of  the  classical 
centroid  shift  calculation  m  and  a  correction  term  based  on  the  relative  spot  size  variance  a 
the  light  level  K  and  spot  shift  correlation  properties  R.  Figure  (6)  illustrates  how  the  classical 
centroid  shift  for  the  jth  subimage,  rhj,  represented  by  the  thick  dashed  line,  is  modified  by 
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the  classical  centroid  shifts  in  the  other  J  —  1  subimages.  The  bjj>  coefficients  are  elements  of 


B  = 


(89) 


which  is  the  second  term  in  Eq.  (88). 

In  the  following  subsections,  Eq.  (88)  is  simplified  for  high  or  low  light  levels  and  when 
the  wave  front  slopes  are  uncorrelated. 


4.2. 1  Bright  objects.  When  imaging  bright  objects,  Eq.  (88)  reduces  to  the  classical 
centroiding  calculation,  implying  that  the  correlation  between  the  slopes  over  each  subaperture 
are  not  important.  For  high  light  levels,  K  is  a  diagonal  matrix  of  large  numbers.  Thus 


which  reduces  Eq.  (88)  to 


x<?  = 


(90) 


When  imaging  infinitely  bright  objects,  xs  =  m  which  implies  that  Eq.  (90)  is  exact.  Next, 
the  behavior  of  the  estimator  is  explored  for  dim  objects. 


4.2.2  Dim  objects.  When  imaging  dim  objects,  K  is  a  diagonal  matrix  of  small 
numbers  and  thus  Eq.  (88)  reduces  to  xs  =  0.  This  “no  shift”  estimate  indicates  that  the 
photon  starved  WFS  image  cannot  produce  a  reliable  measurement.  To  demonstrate  this 
result,  let  K  be  a  diagonal  matrix  of  small  numbers  so  that 
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Figure  6.  Vector  representation  of  the  jth  ML  shift  estimate  given  in  Eq.  (88).  The  tip- 
to-tail  vector  addition  illustrates  how  the  classical  centroid  calculation,  fhj,  is 
modified  by  incorporating  the  centroid  shift  correlation  statistics  and  light  levels 
to  achieve  the  most  likely  value  for  the  centroid,  xsr  The  vectors  —bjjirhj  for 
j'  =  {1, . . . ,  J}  represent  the  additional  knowledge  provided  by  the  ML  theoretic 
approach.  The  thin  dashed  vectors  represent  the  other  J  —  4  unnamed  terms 
modifying  the  classical  centroid  shift. 
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a 


Figure  7.  Plot  of  the  factor  in  Eq.  (92) 


and  thus  Eq.  (88)  becomes 


x<?  = 


m  «  (I  —  I)  m  =  0. 


(91) 


The  following  case  considers  the  special  situation  for  uncorrelated  wave  front  slopes. 


4.2.3  Uncorrelated  wave  front  slopes.  When  the  wave  front  slopes  are  uncorrelated, 
R  becomes  a  diagonal  matrix.  Thus  all  of  the  matrices  in  Eq.  (88)  are  diagonal.  Hence  the 
jth  diagonal  block  element  is  given  by 


rhj 


(92) 


where  <rj  is  the  x  or  y  coordinate  shift  variance  for  the  jth  subimage,  Kj  is  the  average 

photon  count  for  the  jth  subimage,  and  the  factor  1 - - —  is  plotted  in  Fig.  (7),  with 

1  +  at 

a  =  Kjtf/<Tp.  Note  that  for  high  and  low  light  levels,  Eq.  (92)  agrees  with  the  results  of  the 
previous  subsections. 
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In  the  following  section,  the  bias  of  the  shift  vector  estimate  is  calculated. 


4.3  Is  the  maximum  likelihood  shift  vector  estimate  unbiased? 

To  demonstrate  whether  or  not  the  ML  shift  vector  estimate,  x?,isan  unbiased  estimator, 
the  first  moment  of  x5  is  calculated  and  is  compared  to  the  expected  value  of  the  shift  vector, 
X5.  If  f£{xs}  =  E{xs},  then  xs  is  an  unbiased  estimator  of  x5.  The  first  moment  of  the 
shift  vector  estimator  is 


E{xs}  =  CT^m}  =  C^K-^m}, 


(93) 


where  f?{m}  can  be  expressed  as 


£{m}  =  E 


r 

'  I 

^  'Xjd\  [% i ] 

\ 

2—1 

I 

>  - 

L 

Y^Xidj[xi\ 

-  2  —  1 

4 

I 

z=  1 

I 

Y^XiE{dj[xi]} 

i— 1 


(94) 


In  Eq.  (94)  the  expected  value  of  dj  [x,]  can  be  written  as  the  following  nested  conditional 
expectation  relation: 


E{dj[xi]}  —  ExSj  >  (95) 

where  the  inner  expectation  can  be  evaluated  using  Eq.  (26).  Hence  the  inner  expectation 
equates  the  expected  value  of  the  data  and  the  rate  function  yielding 

ED[xi] \xs.  K  N }  =  ~  xs3]  ■  (96) 
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Simplifying  Eq.  (95)  with  Eq.  (96)  yields 


£{<%[%]  >  =  %  .{*!$-*»,]}. 


and  by  definition, 


Exs.  {^[®*  ®5,-]}  —  f  ^lxi  ^Sj]  fxs.{%Sj)dxgj 

3  J  Xs  •  3 


The  rate  function  was  defined  in  Eq.  (56)  as  a  scaled  Gaussian  function 


\f-»  -*  i  Kjl  f  i  G'Sj 

AN-^]  =  ^exp| - ^ 


Equivalent  forms  for  the  magnitude  squared  of  a  2-D  vector  x  are  \x\2  =  x2  +  y2  =  xTx. 


The  PDF  is 


/^(fs')=2^fHCp{-!l f}’ 


where  a2  is  the  variance  of  the  jth  random  shift.  Substituting  Eqs.(99)  and  (100)  into  Eq.  (98) 


yields 


It  MexpJ  MJ!|  i  cxpf 

JxSjex  2-k<t%  P  {  2 J  2% a2  \  2  a]  J 

f  Kjl 2  f  \\xA2  2 xfxSi  |x5.|2  lx*.,  ,  ,  _ 

Ls.ex  4ir2aja]eXP{  [2  a]  2 a]  +  2 a2  +  2a]  \j  dXSj'  (101) 


In  order  to  evaluate  Eq.  (101),  the  “complete  the  square”  operation  is  performed  on  the 
exponent.  For  notational  convenience,  let  fXj  =  ,  thus  the  exponent  of  Eq.  (101) 


p  — j 
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becomes 


^j2  2 xjxs,  |^12  ,  |%12 

'  o  ■ 


K 


2°\ 


2c]  2c] 


=  ft  I  1**1 

=  H  fks, 


7TT~ + 

-1  |2  / 


1 


Ei 


2<^ 


X, 


2oJ^,- 

1  1 


»i  4cT>; 


X; 


(102) 


where  the  square  was  completed  in  Eq.  (102).  Combining  Eqs.  (101)  and  (102)  yields 


EXs  {Aft  -  2*]} 


i*2c]c]  eXP  {2cr|  (1  2c][ij 


\*Ei 


'  Xs- 


[ 


exp  <  -ixj 


xSj~ 


■x. 


dXo. 


Kjfi 

Air/i 


=»//** 


(103) 


Xi\2\  have  been  factored  out  of  the  integral 


7T 


where  the  constants  and  exp 

(  1  21 

and  then  the  integral  of  exp  <  —fXj  Ss3  —  -—5 — Xi  >  i 

\  2(7p/^i  J 

resembles  the  product  of  two  Gaussian  PDFs  [28].  Finally,  combining  Eqs.  (94),  (97),  and 
(103)  yields  the  following  expression: 


integrates  to  —  because  the  integrand 

m 


E'-Jm}  = 


K£  ^  -  /  —1  ( .  1 

Annual  §  *'  6XP  {  2cr2  {  2 a^i , 


Xi 


Kjl <2  * 

4  WM  S 


Xi  exp 


{2a2  (X  2cTplXj ) 


(104) 


Each  of  the  J  identical  elements  equates  to  zero  because  x3  exp  |  — ^  ( 1 - ^ —  ]  |  Xj  | 2 1 

»=i  l2crp  V  2<VW  J 

approximates  the  expected  value  of  a  zero  mean  bivariate  Gaussian  random  variable.  This 
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approximation  was  previously  stated  in  Eq.  (58)  for  a  centered  Gaussian  random  variable. 
Hence,  E{m}  =  0  implies  E{x^}  =  C-1K-10  and  thus 

E{xs}  =  E{xs}  =  0.  (105) 

Therefore,  its  is  an  unbiased  estimator  since  E{ks}  and  E{x5}  are  equal.  Another  important 
metric  used  to  quantify  the  performance  of  an  estimator  is  the  MSE.  The  MSE  for  the  ML 
shift  vector  estimator,  x$,  is  calculated  in  the  next  section. 

4.4  Mean  squared  error  of  the  maximum  likelihood  shift  vector  estimate 

The  performance  of  the  estimator  is  quantified  by  the  MSE  metric.  The  derivation  starts 
with  the  definition  of  the  MSE. 

4.4. 1  Start  with  the  definition.  By  substituting  the  error  covariance  matrix  of  Eq.  (9) 

into  the  MSE  given  in  Eq.  (10)  for  the  ML  shift  vector,  the  MSE  is  written  as 

MSE(xs)  =  tx(E{[xs-E(xs)][xs-E(xs)]T}) 

=  tr(£{[x5][x5]T}),  (106) 

since  Xs  is  an  unbiased  estimator.  Substituting  the  expression  for  the  ML  estimator  from 
Eq.  (82)  into  Eq.  (106)  yields 

MSE(xs)  =  tr  [E  {[C-1m][C-1m]T}) 

=  tr  {c~xE  {mmT}  C-1) 

=  tr  (C^MCT1)  ,  (107) 

where  M  —  E  jmmTj  is  called  the  moment  covariance  matrix. 
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The  MSE  relation  in  Eq.  (107)  can  be  written  as  a  sum  of  terms  to  show  how  knowledge 
of  the  light  level  and  shift  correlation  properties  influence  the  MSE  of  the  ML  shift  estimator. 
Using  Eq.  (87)  to  simplify  Eq.  (107)  yields 


Thus,  the  MSE  of  the  shift  vector  estimate  is  equal  to  the  sum  of  the  classical  centroid  MSE 
calculation,  tr  (M),  and  two  other  terms  which  incorporate  the  light  level,  K,  the  spot  size,  <r 
and  the  shift  correlation  statistics,  R.  By  incorporating  the  knowledge  of  K,  o and  R,  the 
MSE  of  the  estimator,  MSE(x5),  is  reduced.  Additionally  the  MSE  of  the  classical  centroid 
calculation  forms  an  upper  bound  on  the  MSE  of  the  ML  shift  estimate. 

Note  that  for  very  high  levels,  the  MSE  goes  to  zero!  In  other  words,  when  there 
is  an  abundance  of  photo-events,  the  estimator  is  “exact”  in  the  mean-squared  sense.  In 
the  following  subsections,  the  moment  covariance  matrix  M  is  determined  and  the  result  is 
combined  with  Eq.  (108)  to  compute  the  MSE  of  the  shift  parameter  estimator,  x.$. 


4.4.2  Calculating  the  moment  covariance  matrix.  The  calculation  of  the  moment 
covariance  in  Eq.  (108)  has  been  divided  into  several  steps  to  facilitate  the  flow  of  the 
derivation. 


Step  1.  Calculation  of  the  moment  covariance  matrix  elements,  M,  is 
simplified  by  using  m  =  K-1m  and  then  moving  the  expectation  operator  into  the  matrix, 
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thus 


M  =  E  {(K-1m)(K-1m)T} 
=  K-1E{mmT}K-1 


E^mifhi}  • 

E  {m/mf}  • 

••  E{mjmJ } 

where  E  {fhjmj,}  is  a  2  x  2  matrix. 

Step  2.  Now,  evaluate  E  | rhjmj,  }  for  all  j  and  j'  given 

I 

i= 1 


(109) 


(110) 


Computing  the  elements  of  M  in  Eq.  (109)  involves  evaluating  E  {rhjfhji},  which  can  be 
expressed  as 


=  X)  X)  [£»']}•  (111) 

i=i  »'=i 

Further  evaluation  of  Eq.  (Ill)  is  performed  in  two  separate  calculations,  Case  I  for 
j  =  j1  and  Case  II  for  j  ^  j1.  Case  I  is  related  to  the  diagonal  elements  of  the  moment 
covariance  matrix  and  Case  II  is  related  to  the  off-diagonal  elements  of  the  moment  covariance 
matrix. 
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Case  I.  When  j  =  j',  Eq.  (1 1 1)  becomes 


i  i 


E{rhjfhJ)  =^^xixjE{dj[xi]dj[xil}}  +  '£xix?'E  {{djlxi})2}  .  (112) 


i= 1  i’=l 

i^i1 


i— 1 


Using  Eq.  (95),  E{dj[xi]}  can  be  written  as  E$s  {#Dp.]|*s.  {^[^]}|,  and  the  expectation 
over  the  random  shift  vector  can  be  factored  out  of  Eq.  (112)  and  written  as 

[  i  i 

E  yrijrh-  j  =  E%s  <  X ^(D\Si), £>[*,-, ])|x5 .  [®*H-  [xi>] } 


k  2=1  i> 


+ 


'^2,XixJ Ed [S.]|xs^  {(dj[x8])  }  j. 


(113) 


Consider  the  following  facts:  and  Em][jtsj  {(d,-[^])2}  are 

expectations  on  Poisson  random  variables,  and  the  variance  is  equivalent  to  the  first  moment 
of  a  Poisson  random  variable  [28].  Employing  these  facts,  Eq.  (113)  can  be  simplified  as 

E  {fhjrhj  |  =  ^xSj  ^i^vEDfi.]\jts.{dj[xi]}ED^ii]\j£S'{dj[xii]} 

+  X  ED[S.}\jts. {dj [xi] } |  • 

*=i  3  ) 


(114) 


Substituting  the  rate  function  of  Eq.  (96)  into  Eq.  (114)  and  moving  the  expectation  over  the 
shift  vector  back  into  the  summations  yields 


i  i 


Efarnf}  =  ££  Xi xv  E-£s  { A  [xi  x$j ]  A  [xi  — 

i=i  i>= i  3 

+  X XiXi  E%s  {A[a:i  —  (115) 


2=1 


Next,  the  expected  value  of  the  rate  functions  in  the  above  equation  are  evaluated.  The 
first  moment  of  the  rate  function  in  the  single  sum  was  computed  in  an  earlier  derivation  of 
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the  first  moment  of  the  shift  estimator.  Substituting  Hj  =  — ^  into  Eq.  (103)  yields 


%  "  *"]}  “  *4$+o})  eXP{-2^fVsF)'^12}  • 


(116) 


The  expected  value  of  X[x{  —  xs,]  A[®i»  —  xsj  in  the  double  sum  of  Eq.  (1 15)  is  derived 

— * 

next.  Both  rate  functions  are  a  function  of  Xsp  so  the  definition  of  the  expected  value  of  a 
function  of  a  single  random  variable  is  used  to  write  [28]: 

Exs  {-M^  -  xSj]X[xi'  -  £$.]}  =  [  X [xi  -  S3j]><[^  ~  xSj]  fx(xSj)dxSr  (117) 

3  J  JXSj£X  3 

Evaluating  Eq.  (1 17)  with  Eqs.  (99)  and  (100)  yields 


Exs  {-M*i  _  *»*]*[&  -  *s,]} 


aft*  t  {  rw+i* 


m+br** .  i*v  .  i^m 

+— +'2^-J)dIs- 


(118) 


1  1 


For  notational  convenience,  let  Vj  =  —  +  ^ ,  and  then  complete  the  square  on  the  exponent 

in  Eq.  (1 18)  in  the  following  manner: 

\xi\2  +  \xi<\2  2(xi  +  xv)TxSj  \xSj\2  \xSj\2 

i  0  "r 


=  Vj 


=  Vj 


2a] 


I3*/  “  2 +  +  2^  +  ^'|2) 


2a2pVj 


XSi  2a2Vj ^  +  X^  (afa)  ^  +  X^2  +  2a2pUj  ^  + 


(119) 
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Substituting  Eq.  (119)  into  Eq.  (118)  yields 


Ejis  {A[x,  -  $S]Wi>  ~  £*,]} 

(4 

^Xs  ■ 


KW 

8 *WpafX\„p 


xA 2  +  \xiA 2  - 


o  \Xi  “I"  Xji 

°lVj 


ISSjZX 


exp  <  —Vj 


XS 3 


(*Ei  +  ) 


! 


dxSl 


K)l 4 


8X2  VjCT^a2 


exp 


=*/Vj 


m*+\w- 


2  P'2  H-  %i( 

<tyi 


(120) 


where  the  constants  and  exp 


{2^2 


ST  +  * "  - 


•^2  +  i‘ 


i) 


have  been  factored 


out  of  the  integral  and  then  the  integral  of  exp  |  —  Vj 

7T 


XSi 


2a2Uj 


(*^7  “I”  X{t^ 


J 


integrates  to 


vi 


since  the  integrand  resembles  a  product  of  two  Gaussian  PDFs  [28].  Finally,  substituting 


Vj  =  ^  into  Eq.  (120)  yields 

oi  2<t, 


ExSj  {A[®i  -  -  £<?.]} 

^2/4  r  -j  r  j  2  1  "j 

=  4xV|(4  +  2.|)eXP  {"Sf  l1*!*  +  |fi'12  “  4  +  2;  f>  +  ^2\  }  •  <121> 


Equation  (121)  represents  the  covariance  of  two  pixels  in  a  single  subimage  for  the  doubly 
stochastic  Poisson  random  variable.  Substituting  Eqs.  (1 16)  and  (121)  into  Eq.  (1 15)  yields 


E{rhjmj) 

I  I 

=  EE 

i=l  i’=l 
I 


X<i*C£t 


T  w 

4jrV£(<r‘;  +  2a j)  eXp  |  2 al 


{  H  . 


r*  2  »  t*  2 

1  Xjf 


4  ai 


«? + 2ff? 


P'2  ”f” 


+  ^  ^  XjX. 


i*^i 


2=1 


27r( 


f^)exp{-2(5fT^)'s‘'2}- 


(122) 
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Now  that  the  covariance  elements  for  j  =  j'  have  been  evaluated,  the  off-diagonal 
elements,  j  ^  j'  which  represent  the  covariance  of  the  pixels  for  different  subimages  is 
undertaken. 


Case  II.  The  expected  value  of  dj[xi\dji[xii]  is  the  covariance 
of  two  pixels  in  different  subimages.  Equation  (114)  is  actually  a  shorthand  notation  for  the 
following  conditional  expectation: 


E  {dj[xi]djf[xi>]}  —  •  (123) 


Since  dj[xi\  and  dji[xii]  are  conditionally  independent  Poisson  random  variables,  Eq.  (123) 
can  be  written  as 


E  {dj[xi\dji[xi']}  —  E 'xs.'Xs.,  ed[x)\xs., 

=  Exs.,xs.,  {A[^  “  -  2s,]}  ,  (124) 

where  the  first  moment  of  the  data  is  equal  to  the  rate  function  as  shown  in  Eq.  (96).  The 
cross-correlation  of  the  two  functions  of  different  random  variables  in  Eq.  (124)  is  given  by 
[28] 


EXSj  ,XSjl  {'M®*  ^£y]} 

=  fj.  Ifc-fyWZr-ZsfVujt,  JZsj.Zs 025) 

JxSj  Jxsjf J  r 

where  the  joint  PDF  fxs  ,xs  is  determined  with  the  aid  of  Eqs.  (33),  (34),  and 

(35)  for  J  =  2  subapertures.  Thus,  the  joint  PDF  for  the  two  correlated  shift  vectors 

fxSpxSjl  =  (2»)~2|R|“1/2exp  {-^xfR-1^}  ,  (126) 
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where 


x<?  = 


and  |R|  is  the  determinant  of  the  correlation  matrix  R  for  two  subapertures.  The  correlation 
matrix  is  a  block  matrix  with  the  following  form: 


P-jj 

P*j'j  Rj'j' 


where  R jj<  was  defined  in  Eq.  (35).  Substituting  Eqs.(99)  and  (126)  into  Eq.  (125)  yields 


EX5],Xs3,  {A[^  -  XS3Wi'  ~  %]} 

x  (27r)_2|R|-1/2  exp  {"X^R^Xsj  dxSjdxs.,.  (129) 


i2  Y  (  i^-^n  f  i^'-%i2 
2^J  exp{  2a}  )exp{  2*1 


Completing  the  square  of  the  exponents  of  Eq.  (129)  is  considerably  detailed,  thus  many 
of  the  intermediate  steps  have  been  omitted  from  the  following  derivation.  Let  P  =  R-1  and 
then  expand  the  exponential  terms  in  Eq.  (129)  as 


Xi-XSj\2  \Xi'-XSl\2  1 

S} - 2o| - 3X*R  x* 


1  \x?xt  2xfxSj  xJ.xSj  xjxt>  2xJxSjl  xJ.,xSj, 

O  O  ■  O  I  o  o  I*  o 


2  L  al  °l  J 

”  2  xSj  xSj^>jj'  xSjt  +  xs^Pjtj  xsj  +  Xg^Pjiji  xSjt^  •  (130) 
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Note  that  uTAv  =  uiai^vi  +  u\ai^2  +  ^202,1^1  +  ^2^2, 2^2-  The  form  of  the  “square” 
we’re  completing  resembles  the  following  scalar: 


-\(xs-n)TQ  1  (xs  - n) 

=  [(hi  ~  -  tij)T  +  (xs.,  -  njl)TQj>j(xSj  -  ftjf 

+(xS]  ~  ni)TQJy(%  -  nr)T  +  (xSjl  -  nr)TQfj,(xSjl  -  ryf] ,  (131) 


where  Q  =  Q  1  is  a  correlation  matrix  and  ii  acts  like  a  mean  vector.  When  Eqs.  (130)  and 
(131)  are  compared,  the  following  relation  results: 

2 aj  2<r%  2  s  S 

=  ~  (x5  -  n)r  Q_1  (x5  -  n)  -  nTQ-1ii  +  ^-xTx  ,  (132) 

z  I  aP  J 

where 

Q"1  =  R-1  +  a~2I,  (133) 

n  =  °p 2  Qx  =  (i  +  ^R"1)  1  x,  (134) 


(135) 
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Now  the  covariance  of  the  pixels  is  found  by  substituting  the  results  of  Eq.  (132)  into 
Eq.  (129)  to  get 


ExSpxs.,  {A&  -  -  %]} 

=  ^^'(^2)  (S^lRl-^exp  |-^xrx|  exp  {^Q^ii} 

*  L*  exp  {-5 (is  -  (is  - ft)} 

3  3 


=(2tt)2|Q|1/2 


KjKj.fi  ( |Q|\1/2  /  1  _T_\  fl -Tr\-i~\ 

=  exp{-2^xx|exp{2nQ  nl- 


(136) 


Equation  (136)  can  be  further  simplified  by  substituting  the  relation  for  n  found  in  Eq.  (134) 
and  combining  the  exponentials.  Hence,  Eq.  (136)  becomes 


ft/ 
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KjK/l*  I 

iqiV/2 

(  1  xTx 

|  exp  | 

47T2(Jp  \ 

m)  p 
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X 

{“2^iT[I“(I  +  ‘7^"1)"1]i}’  U37) 


where  the  equalities  |Q  l\  =  |Q|  and  QT  =  Q  were  used. 

Step  3.  Now  compile  the  results  of  Step  2  to  compute  the  block  elements 

Mif  =  x~K^E  }  ’  (I38) 


of  M 
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defined  in  Eq.  (109).  Substituting  Eq.  (122)  into  Eq.  (138)  yields 


i  i 


=  EE 


l4XiX$ 


+  2a]) 


exp 


—*  2  |  -* 
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4a] 


<4  +  2 


”f”  ^z1 
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l2Xixf 
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exp 


l  2(<tJh 


+  «?) 


(139) 


The  result  of  Case  II  is  used  to  calculate  the  off-diagonal  elements  of  M.  Substituting 
Eq.  (137)  into  Eq.  (138)  yields 


1/2 


m,, = ee  Wt  (n^r)  t1-  *}• 

(140) 


4.4.3  Summary  of  mean  squared  error  calculation.  By  definition,  the  MSE  of  the 
ML  shift  estimate  is  equal  to  the  trace  of  the  shift  estimate  covariance  matrix.  The  ML  shift 
estimate  was  expressed  in  terms  of  the  classical  centroid  shift,  shift  correlation  statistics,  light 
levels,  and  the  spot  size  variance  in  order  to  show  the  relative  dependence  on  these  factors. 
The  elements  of  the  moment  covariance  matrix  were  then  evaluated.  To  determine  the  MSE 
of  the  ML  shift  estimate,  substitute  the  results  for  the  moment  covariance  matrix  calculation 
found  in  Eqs.  (139)  and  (140)  into  Eq.  (108). 


4.5  Conclusions 

In  this  chapter,  the  ML  shift  estimator,  Xs  =  C_1m,  was  expressed  as  a  function  of 
the  classical  centroid  calculation,  m,  and  a  priori  knowledge  of  the  shift  correlation  matrix, 
R,  the  photon  count  matrix,  K,  and  the  spot  size  variance,  a2.  Several  special  cases  were 
explored,  for  instance,  at  high  light  levels,  the  ML  shift  estimate  reduced  to  the  classical 
centroid  calculation.  The  estimator  was  found  to  be  unbiased.  The  MSE  of  the  estimator  was 
then  shown  to  be  upper  bounded  by  the  MSE  of  the  classical  centroid  method,  which  is  simply 
the  trace  of  the  moment  covariance  matrix.  Lastly,  the  elements  of  the  moment  covariance 
matrix  were  evaluated. 
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V.  Conclusion 


5.1  Introduction 

Atmospheric  turbulence  is  the  primary  agent  responsible  for  distorting  astronomical 
imagery  and  the  finite  light  levels  reaching  the  ground-based  imaging  system  serves  to  limit 
the  performance  of  a  wave  front  sensing  technique.  The  blurring  of  the  imagery  can  be 
lessened  by  an  adaptive  optical  (AO)  imaging  system  or  image  post-processing  techniques 
such  as  deconvolution  from  wave  front  sensing  (DWFS).  The  purpose  of  this  thesis  is  to  derive 
a  maximum  likelihood  (ML)  estimate  of  the  wave  front  slope. 

5.2  Summary  of  methodology 

The  methodology  employed  in  this  thesis  centers  around  constructing  a  probability 
density  function  (PDF)  describing  the  data  collected  by  a  Hartmann-type  wave  front  sensor 
(WFS).  The  task  begins  with  the  formation  of  a  single  image.  The  single  image  is  then 
extended  into  a  composite  image  representing  the  WFS  image.  The  joint  PDF  describing 
WFS  is  employed  by  the  ML  estimation  technique  to  solve  for  the  most  likely  wave  front 
slopes  over  each  subaperture  given  the  WFS  image.  The  optical  detection  process  is  modeled 
by  a  conditionally  Poisson  process,  where  the  Poisson  parameter  or  rate  function  is  itself  a 
random  process.  The  form  of  the  rate  function  is  deterministic,  and  the  stochastic  portion  of 
the  rate  function  is  represented  by  the  location  of  the  spot  in  each  subimage  of  the  Hartmann- 
type  WFS  (H-WFS).  Since  the  slope  of  the  wave  front  over  each  subaperture  of  the  H-WFS 
is  linearly  related  to  the  spot  centroid  shift  from  optical  axis  of  the  lenslet,  finding  the  most 
likely  spot  shifts  is  equivalent  to  estimating  the  wave  front  slopes. 

5.3  Maximum  likelihood  shift  estimator  performance 

The  ML  shift  estimator  incorporates  knowledge  of  the  shift  correlation  properties,  the 
light  level  and  the  spot  variance  size.  This  ML  estimation-theoretic  derivation  also  contains 
the  classical  centroid  calculation  method  as  a  byproduct.  For  high  and  low  light  levels,  the 
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ML  shift  estimate  reduces  to  the  classical  centroid  calculation  and  “no  shift”  respectively.  The 
estimator  was  determined  to  be  unbiased.  Finally,  the  MSE  of  the  estimator  was  shown  to  be 
upper  bounded  by  the  MSE  of  the  classical  centroid  method. 


5.4  Recommendations 

The  estimator  derived  in  this  thesis  has  been  briefly  analyzed.  The  estimator  was 
shown  to  be  unbiased  and  the  mean  squared  error  was  calculated.  To  further  characterize 
this  estimator,  simulated  WFS  images  for  various  H-WFS  configurations,  wave  front  slope 
or  spot  shift  statistics,  and  light  levels  must  be  systematically  analyzed.  Additionally,  the 
performance  of  wave  front  reconstruction  techniques  using  this  ML  spot  position  estimator 
should  be  explored  and  then  compared  and  contrasted  to  minimum  variance  and  least-squares 
reconstruction  techniques  such  as  those  analyzed  by  Roggemann  [34]. 
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